diffusion network
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
One-Step Effective Diffusion Network for Real-World Image Super-Resolution
Pre-trained text-to-image diffusion models have been increasingly employed to tackle the real-world image super-resolution (Real-ISR) problem due to their powerful generative image priors. Most existing methods start from random noise and reconstruct the high-quality (HQ) image under the guidance of the given low-quality (LQ) image. While promising results have been achieved, such Real-ISR methods require multiple diffusion steps to reproduce the HQ image, increasing the computational cost. Meanwhile, the random noise introduces uncertainty into the output, which is undesirable for image restoration tasks. To address these issues, we propose a one-step effective diffusion network, namely OSEDiff, for the Real-ISR problem.
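The contrast the abstract draws — iterating from random noise versus taking a single deterministic step from the LQ input — can be illustrated with a toy sketch. Everything here (the `toy_denoiser` function and its update rule) is a hypothetical stand-in for a trained network, not the actual OSEDiff architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_denoiser(z, lq):
    # Stand-in for a trained denoising network: pulls the current
    # estimate z toward a "restored" target derived from the LQ input.
    # The 1.5x target and 0.9 step size are illustrative only.
    target = lq * 1.5
    return z + 0.9 * (target - z)

lq = rng.random((4, 4))  # toy low-quality input

# Multi-step baseline: start from random noise and iterate; the output
# depends on the noise seed, which is the uncertainty the abstract notes.
z = rng.standard_normal((4, 4))
for _ in range(10):
    z = toy_denoiser(z, lq)
multi_step_out = z

# One-step variant in the spirit of the abstract: start from the LQ
# input itself and apply a single step, so the output is deterministic
# given the input.
one_step_out = toy_denoiser(lq, lq)

print(np.allclose(one_step_out, toy_denoiser(lq, lq)))  # True: no randomness
```

The point of the sketch is only the cost/determinism trade-off: one network evaluation instead of ten, and an output that is a fixed function of the LQ image.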
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Communications > Networks (1.00)
- Information Technology > Data Science > Data Mining (0.94)
Anytime Influence Bounds and the Explosive Behavior of Continuous-Time Diffusion Networks
Kevin Scaman, Rémi Lemonnier, Nicolas Vayatis
The paper studies transition phenomena in information cascades observed along a diffusion process over a graph. We introduce the Laplace Hazard matrix and show that its spectral radius fully characterizes the dynamics of the contagion, both in terms of influence and of explosion time. Using this concept, we prove tight non-asymptotic bounds on the influence of a set of nodes, and we provide an in-depth analysis of the critical time after which the contagion becomes super-critical. Our contributions include a formal definition of, and tight lower bounds on, the critical explosion time. We illustrate the relevance of our theoretical results through several examples of information cascades used in epidemiology and viral marketing models. Finally, we provide a series of numerical experiments on various types of networks which confirm the tightness of the theoretical bounds.
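The criterion the abstract describes — reading off the contagion regime from a spectral radius — is easy to sketch numerically. The matrix entries below are purely illustrative placeholders (the actual construction of the Laplace Hazard matrix is defined in the paper, not here); the sketch only shows the spectral-radius test itself:

```python
import numpy as np

# Toy 3-node directed contagion graph. Entry (i, j) is a hypothetical
# Laplace-transformed hazard value for edge j -> i; the numbers are
# illustrative, not derived from any real transmission model.
H = np.array([
    [0.0, 0.4, 0.1],
    [0.3, 0.0, 0.5],
    [0.2, 0.3, 0.0],
])

# Spectral radius: largest eigenvalue magnitude. Per the abstract, this
# single number characterizes the cascade dynamics; a value below a
# critical threshold corresponds to a sub-critical (non-explosive) regime.
rho = max(abs(np.linalg.eigvals(H)))
print(rho < 1.0)  # True for this toy matrix (all row sums are below 1)
```

Since `H` is nonnegative with every row sum at most 0.8, the Perron–Frobenius bound already guarantees `rho <= 0.8`, so the eigenvalue computation agrees with the quick norm check.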
DiffSoundStream: Efficient Speech Tokenization via Diffusion Decoding
Yang Yang, Yunpeng Li, George Sung, Shao-Fu Shih, Craig Dooley, Alessio Centazzo, Ramanan Rajeswaran
Token-based language modeling is a prominent approach for speech generation, where tokens are obtained by quantizing features from self-supervised learning (SSL) models or by extracting codes from neural speech codecs, generally referred to as semantic tokens and acoustic tokens, respectively. These tokens are often modeled autoregressively, with the inference speed being constrained by the token rate. In this work, we propose DiffSoundStream, a solution that improves the efficiency of speech tokenization in non-streaming scenarios through two techniques: (1) conditioning the neural codec on semantic tokens to minimize redundancy between semantic and acoustic tokens, and (2) leveraging latent diffusion models to synthesize high-quality waveforms from semantic and coarse-level acoustic tokens. Experiments show that at 50 tokens per second, DiffSoundStream achieves speech quality on par with a standard SoundStream model operating at twice the token rate. Additionally, we achieve step-size distillation using just four diffusion sampling steps with only a minor quality loss.
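Since autoregressive inference speed is constrained by the token rate, halving the rate roughly halves the number of sequential decoding steps. A back-of-the-envelope sketch of that trade-off, using the figures in the abstract (50 vs. 100 tokens/s, 4 diffusion steps) and assuming, hypothetically, that each diffusion step costs about as much as one autoregressive step per utterance:

```python
def ar_steps(token_rate_hz, duration_s):
    # Sequential autoregressive decoding steps needed for an utterance:
    # one step per token.
    return token_rate_hz * duration_s

duration = 10  # seconds of speech, illustrative

# Baseline: standard SoundStream-rate model at 100 tokens/s.
baseline = ar_steps(100, duration)

# DiffSoundStream-style: half the token rate, plus 4 distilled diffusion
# sampling steps to decode the waveform (per-utterance, per the abstract).
diffsound = ar_steps(50, duration) + 4

print(baseline, diffsound)  # 1000 504
```

Under these (simplified) cost assumptions, the sequential step count drops by roughly half, which is the efficiency gain the abstract attributes to operating at the lower token rate.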